R demo | Correlation Matrix | How to conduct, visualise and interpret

  • Published: 22 Aug 2024
  • With several numeric variables, we often want to know which of them are correlated and how. A correlation matrix seems like a good solution. But drawing conclusions from plain correlation coefficients and p-values is dangerous if we don't visualize the data. Let's learn a better way to produce a correlation matrix.
    Here is the quick R code:
    # correlation matrix with scatterplots, histograms and significance stars
    install.packages("PerformanceAnalytics")
    library("PerformanceAnalytics")
    chart.Correlation(iris[, 1:3])
    # the same three columns, coloured and correlated per species
    install.packages("tidyverse")
    library(tidyverse) # for "aes()"
    install.packages("GGally")
    library(GGally)
    ggpairs(iris,
            columns = 1:3,
            aes(colour = Species),
            lower = list(continuous = "smooth"),
            upper = list(continuous = wrap("cor", method = "pearson")))
    If you want more code (or want to support me), consider joining the channel (the join button is below any of the videos), because I provide code upon members' requests.
    Enjoy! 🥳
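
    For comparison, here is a minimal base R sketch of the "plain" correlation coefficients and p-values mentioned above, using the same three iris columns. It is only an illustration (the variable names num, cor_mat and ct are made up here), and the numbers alone do not show what the plots reveal.

```r
# Plain Pearson correlation matrix for the first three iris columns (base R only)
num <- iris[, 1:3]
cor_mat <- cor(num, method = "pearson")
print(round(cor_mat, 2))

# cor.test() handles one pair at a time and adds a p-value
ct <- cor.test(num$Sepal.Length, num$Petal.Length, method = "pearson")
print(ct$estimate)  # correlation coefficient for this pair
print(ct$p.value)   # p-value for H0: true correlation is zero
```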

Comments • 21

  • @hikeaway1596 2 months ago

    great practical content! thanks

  • @greenis9163 1 year ago

    You are a life saver!!!

  • @SadatQuayiumApu 2 years ago +1

    Simpson's paradox !

    • @yuzaR-Data-Science 2 years ago +2

      Totally 😁👍 I see it way too often in medical research.
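
      A minimal sketch of how Simpson's paradox shows up in the iris data, using base R only. It is an illustration and not code from the video; the names overall and by_species are made up here. The pooled correlation between sepal length and sepal width is negative, while the correlation within each species is positive.

```r
# Correlation of sepal length and width, ignoring species
overall <- cor(iris$Sepal.Length, iris$Sepal.Width)

# The same correlation computed separately within each species
by_species <- sapply(split(iris, iris$Species),
                     function(d) cor(d$Sepal.Length, d$Sepal.Width))

print(round(overall, 2))     # pooled: negative
print(round(by_species, 2))  # per species: all positive
```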

  • @omerutkuerzengin3061 2 years ago

    Nice work, thank you.

  • @el-houssainebahouar4269 2 years ago

    Thank you Sir!

  • @OFWCREATOR 2 years ago

    Thank you Sir!

  • @myroslavlutsyk5714 2 years ago

    "Package ‘fastStat’ was removed from the CRAN repository.
    Formerly available versions can be obtained from the archive.
    Archived on 2021-07-30 as check problems were not corrected in time. "
    Hi, how could we go on with the tutorial?

    • @yuzaR-Data-Science 2 years ago +2

      Hi Myroslav, fastStat somehow still works on my computer, but it was not the point of this video. The point was to stop using it, for the reasons I talk about further into the video. I hope you enjoy the rest and find it useful. There is a link in the description below the video to my blog, where you will find the R code.

  • @samihahzura4735 2 years ago

    Great video that shows how to do correlation for each specific type of variable! However, I can't do the normality test, as it says no normality function is available?

    • @yuzaR-Data-Science 2 years ago

      Great point! It's probably because you did not use the code above, where I show how to install and load the "dlookr" package. The normality function comes from dlookr:
      install.packages("dlookr")
      library(dlookr)
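
      If installing dlookr is not an option, a rough fallback is the Shapiro-Wilk test from base R. This is a sketch under that assumption, not the code from the video; the names num_cols and normality_p are made up here.

```r
# Shapiro-Wilk normality test for every numeric column of iris (base R only)
num_cols <- iris[sapply(iris, is.numeric)]
normality_p <- sapply(num_cols, function(v) shapiro.test(v)$p.value)

# Small p-values suggest the column departs from a normal distribution
print(signif(normality_p, 3))
```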

  • @hansmeiser6078 1 year ago

    How can we quickly check for linearity programmatically in code, without plots? Or should we check for normality of the residuals of an lm fit? Would it be the same?

    • @yuzaR-Data-Science 1 year ago

      You can actually check all the assumptions visually in one command, "check_model()"! Check out my video on the {performance} package.

    • @hansmeiser6078 1 year ago

      @yuzaR-Data-Science I saw it multiple times. Actually, I only want to check a vector of numbers for linearity, non-visually, from within an automation.

    • @yuzaR-Data-Science 1 year ago

      One vector cannot be checked for linearity, because linear against what? A single vector can only be checked for normality of its distribution. For anything to do with linearity, you need at least two numeric variables, or, in a regression with multiple variables, you can look at the second plot of the residuals from the check_model() function.

    • @hansmeiser6078 1 year ago

      @yuzaR-Data-Science Actually, I need a linearity p-value (if one exists) for two variables, so I can store it or process it further. Is there a dedicated package for this?
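
      One possible way to get such a number with base R, offered as a sketch rather than as the channel author's answer: fit a straight line and a model with an added quadratic term, then compare them with an F-test. The helper linearity_p() is hypothetical and only detects quadratic curvature. A small p-value suggests the straight line is not enough, while a large p-value does not prove linearity. The Ramsey RESET test (resettest() in the lmtest package) is a more general alternative.

```r
# Hypothetical helper: p-value of an F-test comparing a straight-line fit
# against a fit with an extra quadratic term
linearity_p <- function(x, y) {
  fit_linear <- lm(y ~ x)
  fit_curved <- lm(y ~ x + I(x^2))
  anova(fit_linear, fit_curved)[["Pr(>F)"]][2]
}

# Simulated data: one truly linear relationship, one clearly curved
set.seed(1)
x <- seq(1, 10, length.out = 100)
p_line  <- linearity_p(x, 2 * x + rnorm(100))
p_curve <- linearity_p(x, x^2 + rnorm(100))

print(p_line)   # p-value for the linear data
print(p_curve)  # p-value for the curved data, expected to be very small
```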